Idiomatic Expression Paraphrasing without Strong Supervision

Abstract

Idiomatic expressions (IEs) play an essential role in natural language. In this paper, we study the task of idiomatic sentence paraphrasing (ISP), which aims to paraphrase a sentence with an IE by replacing the IE with its literal paraphrase. The lack of large-scale corpora with idiomatic-literal parallel sentences is a primary challenge for this task, for which we consider two separate solutions. First, we propose an unsupervised approach to ISP, which leverages an IE's contextual information and definition and does not require a parallel sentence training set. Second, we propose a weakly supervised approach using back-translation to jointly perform paraphrasing and generation of sentences with IEs to enlarge the small-scale parallel sentence training dataset. Other significant derivatives of the study include a model that replaces a literal phrase in a sentence with an IE to generate an idiomatic expression and a large-scale parallel dataset with idiomatic/literal sentence pairs. The effectiveness of the proposed solutions compared to competitive baselines is seen in the relative gains of over 5.16 points in BLEU, over 8.75 points in METEOR, and over 19.57 points in SARI when the generated sentences are empirically validated on a parallel dataset using automatic and manual evaluations. We demonstrate the practical utility of ISP as a preprocessing step in En-De machine translation.
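To make the ISP task concrete, the following is a minimal sketch of replacing an idiom with a literal paraphrase before a sentence is passed to a downstream system such as machine translation. It is not the approach described in the abstract: it uses a plain dictionary lookup with no contextual information or learned model. The lexicon entries and the function name `paraphrase_idioms` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon mapping idioms to literal paraphrases.
# These entries are illustrative assumptions, not data from the paper.
IDIOM_LEXICON = {
    "kick the bucket": "die",
    "under the weather": "ill",
    "a piece of cake": "very easy",
}


def paraphrase_idioms(sentence, lexicon=IDIOM_LEXICON):
    """Replace each known idiom in `sentence` with its literal paraphrase."""
    result = sentence
    # Process longer idioms first so a short entry cannot split a longer one.
    for idiom in sorted(lexicon, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(idiom) + r"\b", re.IGNORECASE)
        result = pattern.sub(lexicon[idiom], result)
    return result


if __name__ == "__main__":
    print(paraphrase_idioms("I feel under the weather today."))
```

A lookup like this cannot tell an idiomatic use from a literal one ("he kicked the bucket across the yard") and does not handle inflected forms, which is one reason context-aware approaches such as those the abstract describes are of interest.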

Similar resources

Road Detection and Semantic Segmentation without Strong Human Supervision

Recently, convolutional neural networks (CNNs) trained with strong human supervision have been shown to achieve state-of-the-art performance for both road detection and semantic segmentation. However, collecting strongly labeled data for both requires detailed per-pixel annotations from humans, which renders data annotation highly costly and time-consuming. Therefore, in this work we propose methods t...

Type-based Search of Idiomatic Expression

This paper presents an evaluation of different approaches to extracting verb-noun idiomatic expressions in Czech. These approaches are based on the structure of the idiom and its behavior in language. PMI and syntactic and lexical fixedness, modified using VerbaLex and a generated thesaurus, provide a useful tool for choosing the best idiomatic candidates for manual annotation and evaluation. Moreover we focus...

Least Squares Estimation Without Priors or Supervision

Selection of an optimal estimator typically relies on either supervised training samples (pairs of measurements and their associated true values) or a prior probability model for the true values. Here, we consider the problem of obtaining a least squares estimator given a measurement process with known statistics (i.e., a likelihood function) and a set of unsupervised measurements, each arising...

Multi-View Data Generation Without View Supervision

The development of high-dimensional generative models has recently gained a great surge of interest with the introduction of variational auto-encoders and generative adversarial neural networks. Different variants have been proposed where the underlying latent space is structured, for example, based on attributes describing the data to generate. We focus on a particular problem where one aims a...

Expressing Implicit Semantic Relations without Supervision

We present an unsupervised learning algorithm that mines large text corpora for patterns that express implicit semantic relations. For a given input word pair X:Y with some unspecified semantic relations, the corresponding output list of patterns P_1, ..., P_m is ranked according to how well each pattern P_i expresses the relations between X and Y. For example, given X = ostrich and Y = bird,...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i10.21433